fix(tests): use urlparse for host assertions (CodeQL #88-#91) by danielmeppiel · Pull Request #1109 · microsoft/apm

danielmeppiel · 2026-05-02T16:38:52Z

TL;DR

Resolves four high-severity CodeQL py/incomplete-url-substring-sanitization alerts (#88, #89, #90, #91) by switching four substring assert "gitlab.com" in <text> checks in the marketplace test suite to set-equality on urllib.parse-normalised host tokens, per the canonical pattern in .github/instructions/tests.instructions.md.

Problem (WHY)

CodeQL flagged four assertions in tests/unit/marketplace/test_marketplace_commands.py (lines 300, 308, 402) and tests/unit/marketplace/test_marketplace_client.py (line 497):

assert "gitlab.com" in result.output

The rule fires because substring-checking a host name inside a URL-bearing string is the same code shape as a security-critical sanitiser (e.g. evil-gitlab.com would also pass "gitlab.com" in url). The static analyser cannot tell test assertions and production sanitisers apart, so the rule is intentionally conservative.

The repo's canonical lint contract for tests (.github/instructions/tests.instructions.md) explicitly forbids this pattern even in "obviously safe" tests:

Substring assertions like assert "host.example.com" in msg are flagged by CodeQL as py/incomplete-url-substring-sanitization (high severity, "the string may be at an arbitrary position in the URL") and will fail CI.

Approach (WHAT)

Add a small _quoted_hosts(text: str) -> set[str] helper to each affected test file. The helper:

Normalises each token through urllib.parse.urlparse so the assertion compares on parsed hostnames, not raw substrings.
Returns a set[str], so call sites assert with set equality (==), not membership (in) -- equality avoids the residual string in set shape that the canonical instructions note CodeQL still flags.

Call sites change from:

assert "gitlab.com" in result.output

to:

assert _quoted_hosts(result.output) == {"gitlab.com"}

This mirrors the canonical _printed_urls helper in tests/unit/test_mcp_command.py (referenced by the tests-instructions file) but specialised for the Host '...' error pattern these marketplace tests assert against.

Behaviour preserved

Each test still verifies the rejected host name is named in the user-facing error.
"not supported" / "not a supported marketplace source" phrase checks unchanged (plain English, not URL substrings, not flagged by the rule).
"credential" / "leak" negative assertions in test_untrusted_host_error_has_action_in_first_sentence unchanged.

Validation

uv run --extra dev pytest -k "test_add_rejects_non_github_host_with_actionable_error or test_add_rejects_non_github_host_shorthand or test_untrusted_host_error_has_action_in_first_sentence or test_generic_host_rejected_before_request" -- 4 passed.
uv run --extra dev pytest tests/unit/marketplace/ -- 905 passed.
uv run --extra dev ruff check src/ tests/ -- silent.
uv run --extra dev ruff format --check src/ tests/ -- 626 files already formatted.

How to test

uv run --extra dev pytest tests/unit/marketplace/ -x
uv run --extra dev ruff check src/ tests/
uv run --extra dev ruff format --check src/ tests/

After merge, the four CodeQL alerts (#88, #89, #90, #91) close on the next codeql workflow run against main.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Copilot

Pull request overview

This PR updates the test suite to satisfy CodeQL's py/incomplete-url-substring-sanitization rule by replacing URL substring assertions with urlparse-based hostname extraction, and also introduces Cursor slash-command deployment support (plus associated tests, docs, and changelog updates).

Changes:

Replace "gitlab.com" in ... assertions with parsed-host set equality in marketplace unit tests.
Add Cursor .cursor/commands/*.md deployment via the shared command integrator, including traversal guards and richer diagnostics.
Update integration tests, documentation pages, and the changelog to reflect Cursor command support.

Show a summary per file

File	Description
tests/unit/marketplace/test_marketplace_commands.py	Adds `_quoted_hosts()` helper and updates host assertions to use parsed hostnames.
tests/unit/marketplace/test_marketplace_client.py	Adds `_quoted_hosts()` helper and updates exception message assertions to use parsed hostnames.
tests/unit/integration/test_data_driven_dispatch.py	Extends expected bucket list to include `commands_cursor`.
tests/unit/integration/test_command_integrator.py	Refactors duplicated package setup and adds Cursor command integration/dispatch/sync tests.
src/apm_cli/integration/targets.py	Adds Cursor `commands` primitive mapping to `.cursor/commands/` (shared `claude_command` format).
src/apm_cli/integration/command_integrator.py	Adds Cursor command integration path, traversal containment checks, and additional diagnostics.
src/apm_cli/integration/base_integrator.py	Adds containment guard in file discovery to prevent resolved-path escapes from package root.
src/apm_cli/commands/install.py	Reformats `_post_install_summary` signature for readability (no behavior change).
packages/apm-guide/.apm/skills/apm-usage/package-authoring.md	Documents slash-command deployment locations across supported targets.
docs/src/content/docs/reference/cli-commands.md	Adds Cursor commands destination path to uninstall reference table.
docs/src/content/docs/introduction/what-is-apm.md	Updates Cursor capabilities table to include `.cursor/commands/`.
docs/src/content/docs/introduction/how-it-works.md	Updates target auto-detection/integration list to include Cursor commands.
docs/src/content/docs/integrations/ide-tool-integration.md	Documents Cursor commands deployment behavior and frontmatter normalization semantics.
docs/src/content/docs/enterprise/security.md	Adds a new “Slash command deployment” section and target/path table.
CHANGELOG.md	Adds an Unreleased entry describing Cursor slash command support.

Copilot's findings

Comments suppressed due to low confidence (2)

src/apm_cli/integration/command_integrator.py:473

The one-shot passthrough notice claims the shared transformer keeps a frontmatter subset including input, but input is actually mapped to arguments and not written out. Consider adjusting the message to avoid implying input remains in the generated command files.

            diagnostics.info(
                message=(
                    f"{target.name.capitalize()} command files keep "
                    f"Claude-compatible frontmatter (description, "
                    f"allowed-tools, model, argument-hint, input) "
                    f"intentionally for cross-tool compatibility."
                ),

docs/src/content/docs/enterprise/security.md:248

The commands path for OpenCode is documented as .opencode/command/*.md, but the actual integration target directory is .opencode/commands/*.md (plural). This looks like a typo and could mislead users setting up the directory structure.

| **OpenCode** | `.opencode/command/*.md` | Deployed when `.opencode/` exists. |

Files reviewed: 15/15 changed files
Comments generated: 9

+                    f"frontmatter keys not written by the shared command "
+                    f"transformer: {', '.join(dropped_keys)}. "
+                    f"Only description, allowed-tools, model, argument-hint, "
+                    f"and input are preserved."


+                # Containment guard: skip files whose resolved path
+                # escapes the package root.  Hardlinks are not symlinks
+                # so the is_symlink() check above does not catch them.
+                # See path_security.ensure_path_within for the canonical
+                # predicate; we inline the check here to stay loop-fast
+                # and avoid raising on every malicious entry.
+                try:
+                    pkg_resolved = package_path.resolve()
+                    if not resolved.is_relative_to(pkg_resolved):
+                        continue
+                except (ValueError, OSError):
+                    continue


 |----------|---------|
 | `.cursor/rules/*.mdc` | Instructions converted to Cursor rules format |
 | `.cursor/agents/*.md` | Sub-agents from installed packages |
+| `.cursor/commands/*.md` | Slash commands from installed packages (from `.prompt.md` files). Files are deployed when `.cursor/` exists. Frontmatter is normalized to the common Claude-compatible subset (`description`, `allowed-tools`, `model`, `argument-hint`, `input`); Cursor-specific keys (`author`, `mcp`, `parameters`, ...) are dropped with an install-time warning per file. **Lifecycle note:** Cursor 1.6+ only. Cursor is de-emphasizing commands in favor of rules and skills -- monitor [Cursor release notes](https://cursor.com/changelog) for changes to this surface. |


+            # TODO(cursor-command-format): track via dedicated issue once
+            # filed.  Cursor command deployment reuses the shared command
+            # transformer (claude_command), which preserves only the
+            # supported common frontmatter subset (description,
+            # allowed-tools, model, argument-hint, input).  Switch to a
+            # dedicated "cursor_command" format when the integrator
+            # implements a Cursor-specific writer that preserves
+            # Cursor-specific prompt metadata (author, mcp, parameters,
+            # ...) verbatim.  Dropped keys are surfaced via
+            # diagnostics.warn() at install time -- see
+            # command_integrator.
+            "commands": PrimitiveMapping("commands", ".md", "claude_command"),


+            # TODO(cursor-command-format): track via dedicated issue once
+            # filed.  Cursor command deployment reuses the shared command
+            # transformer (claude_command), which preserves only the
+            # supported common frontmatter subset (description,
+            # allowed-tools, model, argument-hint, input).  Switch to a
+            # dedicated "cursor_command" format when the integrator
+            # implements a Cursor-specific writer that preserves
+            # Cursor-specific prompt metadata (author, mcp, parameters,
+            # ...) verbatim.  Dropped keys are surfaced via
+            # diagnostics.warn() at install time -- see
+            # command_integrator.
+            "commands": PrimitiveMapping("commands", ".md", "claude_command"),


+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        # Track which (target_name) values have already received the
+        # one-shot Claude-frontmatter-passthrough notice so the message
+        # fires once per (install run, target), not once per package or
+        # once per file.  Reset implicitly when a new integrator is
+        # constructed (one per install run).
+        self._passthrough_notified: set[str] = set()
+


+| Target | Commands directory | Notes |
+|--------|--------------------|-------|
+| **Claude Code** | `.claude/commands/*.md` | Deployed when `.claude/` exists. |
+| **Cursor** | `.cursor/commands/*.md` | Deployed when `.cursor/` exists. Cursor 1.6+ only; Cursor is de-emphasizing commands in favor of rules/skills -- monitor [Cursor release notes](https://cursor.com/changelog) for changes. The shared command transformer keeps the Claude-compatible frontmatter subset (`description`, `allowed-tools`, `model`, `argument-hint`, `input`); Cursor-specific keys (`author`, `mcp`, `parameters`, ...) are dropped with an install-time warning per file. |


+        # Filename ``..prompt.md`` strips to base_name "." which is a
+        # traversal segment that validate_path_segments must reject.


 - **`apm pack` marketplace builder hardening.** Local source paths are now emitted relative to `metadata.pluginRoot` (fixes double-prefix bug). New pass-through fields: `author`, `license`, `repository`, `keywords` (alias for `tags`). Curator-wins override semantics for `description`/`version` on remote entries. Security guards reject path traversal and absolute paths post-subtraction. (#1061)
 - **Plugin manifest schema-conformance tests.** `tests/unit/test_plugin_exporter_schema.py` validates every shape of `plugin.json` produced by `apm pack` (synthesized, authored, and authored-with-stale-keys) against the vendored official schema. Companion marketplace conformance lives in `tests/unit/marketplace/test_schema_conformance.py`. (#1061)
 - Slash commands installed from APM packages now surface argument hints in Claude Code -- `apm install` automatically maps prompt `input:` to Claude's `arguments:` front-matter, rewrites `${input:name}` references to `$name`, and auto-generates `argument-hint`. Argument names are validated against an allowlist to prevent YAML injection from third-party packages, and the mapping is reported at install time. (#1039)
+- **Cursor slash command support:** `apm install` now deploys package `.prompt.md` files to `.cursor/commands/*.md` when a `.cursor/` directory is present, registering them as Cursor 1.6+ slash commands. Common Claude-compatible frontmatter (`description`, `allowed-tools`, `model`, `argument-hint`, `input`) is preserved; Cursor-specific keys (`author`, `mcp`, `parameters`, ...) are dropped with an install-time `diagnostics.warn()` per file so authors see the lossy transform. Cursor users now get the same slash-command UX as Claude/OpenCode/Gemini users out of the box. Note: Cursor 1.6+ only; Cursor is de-emphasizing commands in favor of rules/skills, so monitor Cursor release notes for changes to this surface. (#1046)


CodeQL py/incomplete-url-substring-sanitization (alerts #88, #89, #90, #91) flagged four assertions that substring-checked 'gitlab.com' inside CLI output / exception text. Per .github/instructions/tests.instructions.md URL/host assertions must compare on a parsed component, not a raw substring. Adds a small _quoted_hosts() helper at the top of each affected test file that regex-extracts `Host '<host>'` tokens, normalises each via urllib.parse.urlparse, and returns a set. Call sites switch from `assert 'gitlab.com' in output` to `assert _quoted_hosts(output) == {'gitlab.com'}` -- set equality on parsed hostnames, the canonical pattern from the tests-instructions file. Behaviour preserved: - Tests still verify the host name is named in the user-facing error. - 'not supported' / 'not a supported marketplace source' phrase checks unchanged (plain English, not host substrings). - 'credential' / 'leak' negative assertions unchanged. Verified: - Targeted 4 tests: pass. - Full tests/unit/marketplace/ suite: 905 passed. - ruff check + ruff format --check: silent. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings May 2, 2026 16:38

danielmeppiel requested a review from sergio-sisternes-epam as a code owner May 2, 2026 16:38

Copilot started reviewing on behalf of danielmeppiel May 2, 2026 16:39 View session

Copilot AI reviewed May 2, 2026

View reviewed changes

danielmeppiel force-pushed the fix/codeql-url-substring-marketplace-tests branch from 42641d8 to 3fff1fb Compare May 2, 2026 16:47

Merge branch 'main' into fix/codeql-url-substring-marketplace-tests

af64e89

danielmeppiel merged commit e99fc84 into main May 2, 2026
8 checks passed

danielmeppiel deleted the fix/codeql-url-substring-marketplace-tests branch May 2, 2026 16:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(tests): use urlparse for host assertions (CodeQL #88-#91)#1109

fix(tests): use urlparse for host assertions (CodeQL #88-#91)#1109
danielmeppiel merged 2 commits into
mainfrom
fix/codeql-url-substring-marketplace-tests

danielmeppiel commented May 2, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		# Filename ``..prompt.md`` strips to base_name "." which is a
		# traversal segment that validate_path_segments must reject.

Conversation

danielmeppiel commented May 2, 2026

TL;DR

Problem (WHY)

Approach (WHAT)

Behaviour preserved

Validation

How to test

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants